euMMD: efficiently computing the MMD two-sample test statistic for univariate data

Authors

Abstract

The maximum mean discrepancy (MMD) test is a nonparametric kernelised two-sample test that, when using a characteristic kernel, can detect any distributional change between two samples. However, when the total number of $d$-dimensional observations is $n$, direct computation of the test statistic is $\mathcal{O}(dn^2)$. While approximations with lower computational complexity are known, more efficient methods for computing the exact test statistic are unknown. This paper provides an exact method for computing the MMD test statistic for the univariate case in $\mathcal{O}(n\log n)$ using the Laplacian kernel. Furthermore, this exact method is extended to an approximate method for $d$-dimensional real-valued data, also with complexity log-linear in the number of observations. Experiments show that this approximate method can have good statistical performance when compared to the exact test, particularly in cases where $d > n$.
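To make the complexity claim concrete, the following is a minimal sketch and not the paper's algorithm. It assumes the Laplacian kernel is parametrised as $k(x, y) = \exp(-|x - y| / \sigma)$ and uses the biased (V-statistic) estimate of the squared MMD; the function names and the `sigma` parameter are hypothetical. The first function is the direct quadratic computation. The second sorts the pooled sample and uses a one-pass recursion that relies on the exponential form of the kernel, so its cost is dominated by the sort.

```python
import math


def laplacian_kernel(a, b, sigma=1.0):
    # Assumed parametrisation: k(a, b) = exp(-|a - b| / sigma).
    return math.exp(-abs(a - b) / sigma)


def mmd2_naive(x, y, sigma=1.0):
    """Biased squared-MMD estimate by direct summation over all pairs."""
    m, n = len(x), len(y)
    kxx = sum(laplacian_kernel(a, b, sigma) for a in x for b in x) / (m * m)
    kyy = sum(laplacian_kernel(a, b, sigma) for a in y for b in y) / (n * n)
    kxy = sum(laplacian_kernel(a, b, sigma) for a in x for b in y) / (m * n)
    return kxx + kyy - 2.0 * kxy


def mmd2_sorted(x, y, sigma=1.0):
    """Same estimate via sorting plus a single linear pass.

    Each pooled point z_i gets weight w_i = 1/m (from x) or -1/n (from y),
    so the estimate equals sum_{i,j} w_i * w_j * k(z_i, z_j).
    """
    m, n = len(x), len(y)
    pooled = sorted([(v, 1.0 / m) for v in x] + [(v, -1.0 / n) for v in y])

    # Diagonal terms: k(z, z) = 1.
    total = sum(w * w for _, w in pooled)

    # r holds sum_{j < i} w_j * exp(-(z_i - z_j) / sigma) for the current i.
    r = 0.0
    for (z_prev, w_prev), (z_cur, w_cur) in zip(pooled, pooled[1:]):
        r = (r + w_prev) * math.exp(-(z_cur - z_prev) / sigma)
        total += 2.0 * w_cur * r  # counts both (i, j) and (j, i)
    return total


x = [0.1, 0.5, 0.9, 1.3]
y = [0.2, 0.4, 2.0]
print(mmd2_naive(x, y), mmd2_sorted(x, y))
```

The two functions should agree up to floating-point rounding. The recursion only multiplies by factors in $(0, 1]$, which avoids overflow for widely spread data.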

Related resources

Asymptotic algorithm for computing the sample variance of interval data

The problem of the sample variance computation for epistemic interval-valued data is, in general, NP-hard. Therefore, known efficient algorithms for computing variance require strong restrictions on admissible intervals like the no-subset property or heavy limitations on the number of possible intersections between intervals. A new asymptotic algorithm for computing the upper bound of the samp...

Two-Sample Median Test for Vague Data

Classical statistical tests may be sensitive to violations of the fundamental model assumptions inherent in the derivation and construction of these tests. It is obvious that such violations are much more probable in the presence of vague data. Thus nonparametric tests seem to be promising statistical tools. A generalization of the median test for the two-sample problem with vague data is sugge...

Test for Exponentiality Based on the Sample Covariance

This paper proposes a simple goodness-of-fit test based on the sample covariance. It is shown that this test is preferable for alternatives of increasing and unimodal failure rate. Critical values for various sample sizes are determined by means of Monte Carlo simulations. We compare the test based on the sample covariance with tests based on Hoeffding's maximum correlation. The usefulness o...

A multivariate two-sample mean test for small sample size and missing data.

We develop a new statistic for testing the equality of two multivariate mean vectors. A scaled chi-squared distribution is proposed as an approximating null distribution. Because the test statistic is based on componentwise statistics, it has the advantage over Hotelling's $T^2$ test of being applicable to the case where the dimension of an observation exceeds the number of observations. An appeal...

Efficiently Computing Data-Independent Memory-Hard Functions

A memory-hard function (MHF) f is equipped with a space cost σ and time cost τ parameter such that repeatedly computing fσ,τ on an application specific integrated circuit (ASIC) is not economically advantageous relative to a general purpose computer. Technically we would like that any (generalized) circuit for evaluating an iMHF fσ,τ has area × time (AT) complexity at Θ(σ ∗ τ). A data-independe...

Journal

Journal title: Statistics and Computing

Year: 2023

ISSN: 0960-3174, 1573-1375

DOI: https://doi.org/10.1007/s11222-023-10271-x